CDS6334 - Lecture 1 Notes

1. What is Computer Vision?

Computer Vision is a branch of Artificial Intelligence that enables computers to detect, process, analyze and understand visual information from images and videos.

🧠 Remember:

Human → Understands visual scenes naturally
Computer Vision → Tries to teach machines to do the same

2. Goal of Computer Vision

Convert image/video data into meaningful information for decision making.

Recognize objects
Recognize people
Understand scenes
Interpret activities

Exam Keyword:
Image/Video → Information

3. Human Vision vs Computer Vision

Human Vision	Computer Vision
Naturally interprets scenes	Requires algorithms and data
Handles ambiguity well	Can be fooled easily
Fast understanding	Needs computation

🧠 Human vision remains more robust than machines in many situations.

4. Fields Related to Images and Videos

Field	Main Purpose
Computer Graphics	Create images
Image Processing	Manipulate images
Computer Vision	Understand images

🧠 Shortcut:

Graphics → Create
Processing → Manipulate
Vision → Understand
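
The three-way split above can be sketched with a toy example. This is only an illustration, not a method from the lecture: the function names (create_image, invert, bounding_box) and the tiny 5 x 6 image are assumptions made for this sketch, and the image is a plain list of lists of grayscale values.

```python
def create_image(height, width, square):
    """Graphics: description -> image. Draw a white square on a black background."""
    top, left, size = square
    return [
        [255 if top <= r < top + size and left <= c < left + size else 0
         for c in range(width)]
        for r in range(height)
    ]


def invert(image):
    """Image processing: image -> image. Produce the negative of the input."""
    return [[255 - p for p in row] for row in image]


def bounding_box(image):
    """Vision: image -> information. Locate the bright object as (top, left, bottom, right)."""
    coords = [(r, c) for r, row in enumerate(image)
              for c, p in enumerate(row) if p > 127]
    if not coords:
        return None  # no bright pixels, so nothing to report
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


image = create_image(5, 6, square=(1, 2, 3))   # graphics
negative = invert(image)                       # processing
box = bounding_box(image)                      # vision
print(box)  # (1, 2, 3, 4)
```

Note the output types: graphics and processing both return an image, while the vision step returns a small piece of information about the image.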

5. Three Main Areas of Computer Vision

Measurement
Perception and Interpretation
Search and Organization

Measurement: Recover information about the real 3D world from images.

Perception & Interpretation: Recognize objects, people, activities and scenes.

Search & Organization: Find and organize visual data efficiently.

6. Measurement

Example: Reconstructing 3D models from multiple images.

Computer vision estimates properties of the real world from visual data.

7. Perception and Interpretation

Examples: Detecting faces, recognizing objects and understanding scenes.

🧠 Think:
"What is in the image?"

8. Search and Organization

Examples: Google Image Search and image retrieval systems.

Visual data can be indexed, searched and categorized automatically.
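
A very small sketch of "index and search" is shown below. It assumes a simple intensity histogram as the image descriptor and an L1 distance for ranking. Both are choices made for this illustration and are not how any particular search engine works. The database and image values are invented.

```python
def histogram(image, bins=4):
    """Index step: summarise an image as a normalised intensity histogram."""
    counts = [0] * bins
    total = 0
    for row in image:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def distance(h1, h2):
    """L1 distance between two histograms (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def search(query, database):
    """Search step: rank database image names from most to least similar."""
    q = histogram(query)
    return sorted(database, key=lambda name: distance(q, histogram(database[name])))


# Hypothetical 2 x 2 grayscale images standing in for a photo collection.
database = {
    "dark":   [[10, 20], [30, 40]],
    "bright": [[220, 230], [240, 250]],
    "mixed":  [[10, 240], [20, 250]],
}
query = [[200, 210], [225, 235]]

ranking = search(query, database)
print(ranking)  # ['bright', 'mixed', 'dark']
```

Real retrieval systems use far richer descriptors, but the pattern is the same: describe every image once, then compare descriptors instead of raw pixels.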

9. Applications of Computer Vision

Autonomous vehicles
Face recognition
Biometrics
Optical Character Recognition (OCR)
Medical imaging
Industrial inspection
Retail automation
Satellite image analysis
Space exploration
Gaming and interaction systems

Exam Tip:
Be able to explain at least 3 real-world applications.

10. Biometrics and Recognition

Technology	Purpose
Face Recognition	Identity verification
Fingerprint Recognition	User authentication
Iris Recognition	High-security identification

11. Optical Character Recognition (OCR)

OCR converts images containing text into machine-readable text.

Examples: license plate recognition, digit recognition, scanned document conversion.

12. Why Computer Vision is Difficult

Computer Vision is an ill-posed problem: the input does not determine a unique solution.

The real world is 3D, but images are only 2D projections.

🧠 Remember:

Many different real-world situations can produce the same image.
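
The ambiguity can be seen in a few lines of code. The sketch below assumes an idealised pinhole camera at the origin looking along the Z axis, with a focal length of 1. The function name project and the sample points are made up for this illustration.

```python
def project(point_3d, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z) onto the 2D image plane."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)   # twice as far away and twice as far off-axis

near_pixel = project(near_point)
far_pixel = project(far_point)

print(near_pixel)  # (0.25, 0.5)
print(far_pixel)   # (0.25, 0.5)
```

Both 3D points land on the same image position, so the depth cannot be recovered from this single image alone. Extra information (more images, prior knowledge, learned models) is needed to choose between the possible scenes.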

13. Digital Images

A digital image is composed of pixels arranged in rows and columns.

🧠 Pixel = Smallest unit of a digital image.
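
As a minimal sketch, a grayscale image can be stored as a list of rows, where each row is a list of pixel intensities. The 0 to 255 range and the sample values are assumptions for this illustration. Real code would normally use an array library.

```python
# A tiny grayscale image: 3 rows x 4 columns, one intensity (0-255) per pixel.
image = [
    [0,  50, 100, 150],
    [10, 60, 110, 160],
    [20, 70, 120, 170],
]

height = len(image)      # number of rows
width = len(image[0])    # number of columns
pixel = image[1][2]      # pixel at row 1, column 2 (indices start at 0)

print(height, width, pixel)  # 3 4 110
```

The row index comes first and the column index second, which is the usual convention when an image is treated as a matrix.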

14. Image Processing Tasks

Pixel Manipulation
Filtering
Restoration
Enhancement
Edge Detection
Segmentation

These are core low-level image processing operations.
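
Three of the tasks listed above can be sketched in plain Python. These are deliberately simplified versions chosen for this illustration (a 3x3 mean filter, a horizontal neighbour difference, and a fixed threshold), not the specific algorithms covered in the course. The function names and the test image are assumptions.

```python
def mean_filter(image):
    """Filtering: replace each interior pixel with the mean of its 3x3 neighbourhood.
    Border pixels are copied unchanged to keep the sketch short."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = sum(window) / 9
    return out


def horizontal_edges(image):
    """Edge detection: absolute difference between horizontally adjacent pixels."""
    return [[abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
            for row in image]


def threshold(image, level):
    """Segmentation: label each pixel 1 (object) if >= level, else 0 (background)."""
    return [[1 if p >= level else 0 for p in row] for row in image]


# Dark region on the left, bright region on the right.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

smoothed = mean_filter(image)
edges = horizontal_edges(image)
mask = threshold(image, 128)

print(edges[0])  # [0, 190, 0]
print(mask[0])   # [0, 0, 1, 1]
```

All three functions take an image and return an image-like grid, which is what makes them image processing operations rather than vision tasks.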

15. Colour Image Processing

Colour Spaces
Colour Manipulation
Colour Analysis

RGB (red, green, blue) is the most common colour representation for digital images.
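
A short sketch of colour manipulation: converting RGB pixels to grayscale. It assumes 8-bit channels (0 to 255) and uses the widely used luma weights 0.299, 0.587 and 0.114. Other weightings exist, so treat these as one common convention rather than the only option.

```python
def rgb_to_gray(pixel):
    """Convert one (R, G, B) pixel to a single grayscale intensity."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


# A 1 x 4 colour image: red, green, blue and white pixels.
colour_row = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]

gray_row = [rgb_to_gray(p) for p in colour_row]
red_channel = [p[0] for p in colour_row]   # extract a single colour channel

print(gray_row)     # [76, 150, 29, 255]
print(red_channel)  # [255, 0, 0, 255]
```

Green contributes most to the grayscale value and blue least, which reflects how bright each primary appears to the human eye.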

16. Higher-Level Vision Tasks

Representation
Description
Recognition
Interpretation
Semantic Understanding

Modern deep learning systems perform object detection and scene understanding.

17. Emerging Applications

Visual Captioning
Visual Question Answering (VQA)
Egocentric Vision
Fashion Recommendation Systems
Autonomous Systems

18. Final Exam Summary

Most Important Points

Computer Vision: Image/Video → Information
Image Processing: Image → Image
Graphics: Create images
Three Areas: Measurement, Perception and Interpretation, Search and Organization
Challenge: 3D world projected to 2D images
OCR: Image text → Digital text
Applications: Face recognition, medical imaging, autonomous vehicles
Digital Image: Collection of pixels

CDS 6334 - Visual Image Processing

Lecture 1: Introduction to Images